MCA-NMF: Multimodal Concept Acquisition with Non-Negative Matrix Factorization

Similar resources

MCA-NMF: Multimodal Concept Acquisition with Non-Negative Matrix Factorization

In this paper we introduce MCA-NMF, a computational model of the acquisition of multimodal concepts by an agent grounded in its environment. More precisely, our model finds patterns in multimodal sensor input that characterize associations across modalities (speech utterances, images and motion). We propose this computational model as an answer to the question of how some class of concepts can b...

Non-negative Matrix Factorization for Word Acquisition from Multimodal Information Including Speech

The current generation of automatic speech recognizers incorporates a lot of hard-coded knowledge about how speech is structured. Yet children seem to discover the structure of speech and language from examples. A new computational method to discover lexical items with little or no supervision, based on non-negative matrix factorization (NMF) of cooccurrence counts of low-level acoustic events ...

Multimodal Image Collection Visualization Using Non-negative Matrix Factorization

In this paper we address the problem of generating an image collection visualization in which images and text can be projected together. Given a collection of images with attached text annotations, we aim to find a common representation for both information sources to model latent correlations among the collection. Using the proposed latent representation, an image collection visualization is b...

Multimodal voice conversion based on non-negative matrix factorization

A multimodal voice conversion (VC) method for noisy environments is proposed. In our previous non-negative matrix factorization (NMF)-based VC method, source and target exemplars are extracted from parallel training data, in which the same texts are uttered by the source and target speakers. The input source signal is then decomposed into source exemplars, noise exemplars, and their weights. Th...

Journal

Journal title: PLOS ONE

Year: 2015

ISSN: 1932-6203

DOI: 10.1371/journal.pone.0140732